Throughput/Precision Computation Of Convolution In Programmable Processors
Authors
Abstract
Convolution and cross-correlation are the basis of filtering and of pattern or template matching in digital signal processing (DSP). We propose a throughput scaling technique for any one-dimensional convolution kernel in programmable processors that works by adjusting the imprecision (distortion) of the computation. Our approach is based on scalar quantization, followed by a new form of tight packing in floating-point that allows multiple results to be calculated concurrently. Indicative experimental results with a digital music matching system demonstrate that the proposed approach offers an increase of up to 112% in processing throughput over optimized convolution, with no effect on the accuracy of the application results.
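The idea of quantizing inputs and then packing several of them into one floating-point value can be illustrated with a minimal sketch. This is not the packing scheme of the paper, whose details the abstract does not give; it is a generic illustration under stated assumptions: the signals and the kernel are quantized to integers, two signals share one double-precision value through a power-of-two offset, and all intermediate magnitudes stay small enough for double precision to represent them exactly. The names `quantize`, `convolve`, `packed_convolve`, and the parameter `scale` are hypothetical.

```python
def quantize(signal, step):
    """Uniform scalar quantization: map each sample to an integer index."""
    return [round(s / step) for s in signal]


def convolve(x, h):
    """Direct 1-D convolution producing the full-length output."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def packed_convolve(x1, x2, h, scale=2.0 ** 20):
    """Convolve two integer signals with an integer kernel in one pass.

    Assumption: every output of x1 * h has magnitude below scale / 2, and
    all packed results stay below 2**53 so doubles hold them exactly.
    """
    # Pack: x1 occupies the low part of each double, x2 the high part.
    packed = [a + scale * b for a, b in zip(x1, x2)]
    # By linearity, one convolution computes (x1 * h) + scale * (x2 * h).
    r = convolve(packed, h)
    # Unpack: rounding r / scale isolates the high part; the rest is the low part.
    y2 = [round(v / scale) for v in r]
    y1 = [round(v - scale * b) for v, b in zip(r, y2)]
    return y1, y2


x1 = quantize([0.25, -0.5, 0.75, 1.0], 0.25)   # integer indices of signal 1
x2 = quantize([1.25, 1.5, -1.75, 2.0], 0.25)   # integer indices of signal 2
h = [1, 2, -1]                                  # integer kernel
y1, y2 = packed_convolve(x1, x2, h)
```

In this sketch the coarser the quantization step, the smaller the integer range, and the more signals could in principle share one floating-point value; that is the kind of trade between precision and throughput the abstract describes.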
Similar resources
Accelerating Seismic Computations Using Customized Number Representations on FPGAs
The oil and gas industry has an increasingly large demand for high-performance computation over huge volumes of data. Compared to common processors, field-programmable gate arrays (FPGAs) can boost the computation performance with a streaming computation architecture and the support for application-specific number representation. With hardware support for reconfigurable number format and bit widt...
Field Programmable Gate Array Implementation of Active Control Laws for Multi-mode Vibration Damping
This paper investigates the possibility and effectiveness of multi-mode vibration control of a plate through real-time FPGA (Field Programmable Gate Array) implementation. This type of embedded system offers true parallel and high-throughput computation abilities. The control object is an aluminum panel, clamped to a Perspex box’s upper side. Two types of control laws are studied. The first belo...
High Speed DWT Processor Implementation in FPGA
This paper presents a high-speed and area-efficient VLSI-based DWT processor design for image compression applications. In the proposed design, a pipelined, partially serial architecture is used to enhance the speed while making optimal use of the resources available on the target FPGA. The architecture consists of two row processors, two column processors, and two memory modules. Each p...
A Rotation-based Data Buffering Architecture for Convolution Filtering in a Field Programmable Gate Array
Convolution filtering applications range from image recognition to video surveillance. Two observations drive the design of a new buffering architecture for convolution filters. First, the convolutional operations are inherently local; hence every pixel of the output feature maps is calculated from the neighboring pixels of the input feature maps. Even though the operation is simple, the convolu...
Precision-Energy-Throughput Scaling of Generic Matrix Multiplication and Convolution Kernels via Linear Projections
Generic matrix multiplication (GEMM) and one-dimensional convolution/cross-correlation (CONV) kernels often constitute the bulk of the compute- and memory-intensive processing within image/audio recognition and matching systems. We propose a novel method to scale the energy and processing throughput of GEMM and CONV kernels for such error-tolerant multimedia applications by adjusting the precision...